DNN-Based Feature Enhancement Using Joint Training Framework for Robust Multichannel Speech Recognition
Abstract
Since deep neural networks (DNNs) were introduced to the speech signal processing community, the performance of automatic speech recognition (ASR) has improved greatly. As a result, demand for applications in distant-talking environments has also increased. However, ASR performance in such environments is still far below that in close-talking environments due to various problems. In this paper, we propose a novel multichannel feature mapping technique that combines a conventional beamformer, a DNN, and a joint training scheme. Experiments on the multichannel Wall Street Journal audio-visual (MC-WSJAV) corpus show that the proposed technique effectively models the complicated relationship between the array inputs and clean speech features by employing an intermediate target. The proposed method outperformed the conventional DNN system.
Similar resources
The NTU-ADSC Systems for Reverberation Challenge
This paper describes our speech enhancement and recognition systems developed for the Reverberation Challenge 2014. To enhance the noisy and reverberant speech for human listening, besides using conventional methods such as delay and sum beamformer and late reverberation reduction by spectral subtraction, we also studied a novel learning-based speech enhancement. Specifically, we train deep neu...
The NTU-ADSC Systems for Reverberation Challenge 2014
This paper describes our speech enhancement and recognition systems developed for the Reverberation Challenge 2014. To enhance the noisy and reverberant speech for human listening, besides using conventional methods such as delay and sum beamformer and late reverberation reduction by spectral subtraction, we also studied a novel learning-based speech enhancement. Specifically, we train deep neu...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
Joint Training of Multi-Channel-Condition Dereverberation and Acoustic Modeling of Microphone Array Speech for Robust Distant Speech Recognition
We propose a novel data utilization strategy, called multichannel-condition learning, leveraging upon complementary information captured in microphone array speech to jointly train dereverberation and acoustic deep neural network (DNN) models for robust distant speech recognition. Experimental results, with a single automatic speech recognition (ASR) system, on the REVERB2014 simulated evaluati...
Deep neural network based spectral feature mapping for robust speech recognition
Automatic speech recognition (ASR) systems suffer from performance degradation under noisy and reverberant conditions. In this work, we explore a deep neural network (DNN) based approach for spectral feature mapping from corrupted speech to clean speech. The DNN based mapping substantially reduces interference and produces estimated clean spectral features for ASR training and decoding. We expe...